
chore: update torch v2.5.1 #1849

Merged 6 commits into main on Nov 17, 2024

Conversation

@zhyncs (Member) commented Oct 31, 2024

Motivation

ref https://github.com/sgl-project/sglang/actions/runs/11605237798/job/32315285934

Modifications

Checklist

  • Format your code according to the Contributor Guide.
  • Add unit tests as outlined in the Contributor Guide.
  • Update documentation as needed, including docstrings or example tutorials.

@zhyncs zhyncs marked this pull request as draft October 31, 2024 02:49
@merrymercy (Contributor) commented

cc @jerryzh168 who needs torch 2.5

(Review thread on python/pyproject.toml — outdated, resolved)
@fengyang95 commented

@zhyncs In this case, does torch.compile now support FP8?

@zhyncs zhyncs marked this pull request as ready for review November 15, 2024 09:38
@zhyncs zhyncs marked this pull request as ready for review November 16, 2024 19:27
@zhyncs (Member Author) commented Nov 17, 2024

Currently, the performance of non-streaming small-batch-size cases is below expectations. Ref #1563. cc @Ying1123

@zhyncs (Member Author) commented Nov 17, 2024

[
  {
    "timestamp": "2024-11-17T07:25:49.048134",
    "model": "meta-llama/Llama-3.1-8B-Instruct",
    "metrics": {
      "en": 0.836,
      "en:std": 0.3702755730533679,
      "group_latin": 0.836,
      "group_latin:std": 0.3702755730533679,
      "score:std": 0.3702755730533679,
      "score": 0.836
    },
    "score": 0.836
  },
  {
    "timestamp": "2024-11-17T07:26:37.000698",
    "model": "mistralai/Mistral-7B-Instruct-v0.3",
    "metrics": {
      "en": 0.604,
      "en:std": 0.48906441293555597,
      "group_latin": 0.604,
      "group_latin:std": 0.48906441293555597,
      "score:std": 0.48906441293555597,
      "score": 0.604
    },
    "score": 0.604
  },
  {
    "timestamp": "2024-11-17T07:28:02.777706",
    "model": "deepseek-ai/DeepSeek-Coder-V2-Lite-Instruct",
    "metrics": {
      "en": 0.876,
      "en:std": 0.3295815528818322,
      "group_latin": 0.876,
      "group_latin:std": 0.3295815528818322,
      "score:std": 0.3295815528818322,
      "score": 0.876
    },
    "score": 0.876
  },
  {
    "timestamp": "2024-11-17T07:29:14.852606",
    "model": "google/gemma-2-27b-it",
    "metrics": {
      "en": 0.924,
      "en:std": 0.26499811320083017,
      "group_latin": 0.924,
      "group_latin:std": 0.26499811320083017,
      "score:std": 0.26499811320083017,
      "score": 0.924
    },
    "score": 0.924
  },
  {
    "timestamp": "2024-11-17T07:32:26.439859",
    "model": "meta-llama/Llama-3.1-70B-Instruct",
    "metrics": {
      "en": 0.976,
      "en:std": 0.15304901175767194,
      "group_latin": 0.976,
      "group_latin:std": 0.15304901175767194,
      "score:std": 0.15304901175767194,
      "score": 0.976
    },
    "score": 0.976
  },
  {
    "timestamp": "2024-11-17T07:34:16.554962",
    "model": "mistralai/Mixtral-8x7B-Instruct-v0.1",
    "metrics": {
      "en": 0.648,
      "en:std": 0.4775939698111775,
      "group_latin": 0.648,
      "group_latin:std": 0.4775939698111775,
      "score:std": 0.4775939698111775,
      "score": 0.648
    },
    "score": 0.648
  },
  {
    "timestamp": "2024-11-17T07:35:54.798802",
    "model": "Qwen/Qwen2-57B-A14B-Instruct",
    "metrics": {
      "en": 0.884,
      "en:std": 0.32022492095400695,
      "group_latin": 0.884,
      "group_latin:std": 0.32022492095400695,
      "score:std": 0.32022492095400695,
      "score": 0.884
    },
    "score": 0.884
  },
  {
    "timestamp": "2024-11-17T07:37:30.045613",
    "model": "deepseek-ai/DeepSeek-Coder-V2-Lite-Instruct",
    "metrics": {
      "en": 0.86,
      "en:std": 0.3469870314579494,
      "group_latin": 0.86,
      "group_latin:std": 0.3469870314579494,
      "score:std": 0.3469870314579494,
      "score": 0.86
    },
    "score": 0.86
  },
  {
    "timestamp": "2024-11-17T07:38:26.084096",
    "model": "neuralmagic/Meta-Llama-3.1-8B-Instruct-FP8",
    "metrics": {
      "en": 0.88,
      "en:std": 0.32496153618543844,
      "group_latin": 0.88,
      "group_latin:std": 0.32496153618543844,
      "score:std": 0.32496153618543844,
      "score": 0.88
    },
    "score": 0.88
  },
  {
    "timestamp": "2024-11-17T07:39:14.548670",
    "model": "neuralmagic/Mistral-7B-Instruct-v0.3-FP8",
    "metrics": {
      "en": 0.552,
      "en:std": 0.4972886485734417,
      "group_latin": 0.552,
      "group_latin:std": 0.4972886485734417,
      "score:std": 0.4972886485734417,
      "score": 0.552
    },
    "score": 0.552
  },
  {
    "timestamp": "2024-11-17T07:40:51.686684",
    "model": "neuralmagic/DeepSeek-Coder-V2-Lite-Instruct-FP8",
    "metrics": {
      "en": 0.888,
      "en:std": 0.31536645351083237,
      "group_latin": 0.888,
      "group_latin:std": 0.31536645351083237,
      "score:std": 0.31536645351083237,
      "score": 0.888
    },
    "score": 0.888
  },
  {
    "timestamp": "2024-11-17T07:41:42.733429",
    "model": "neuralmagic/gemma-2-2b-it-FP8",
    "metrics": {
      "en": 0.612,
      "en:std": 0.4872945721019269,
      "group_latin": 0.612,
      "group_latin:std": 0.4872945721019269,
      "score:std": 0.4872945721019269,
      "score": 0.612
    },
    "score": 0.612
  },
  {
    "timestamp": "2024-11-17T07:43:15.627071",
    "model": "neuralmagic/Meta-Llama-3.1-70B-Instruct-FP8",
    "metrics": {
      "en": 0.964,
      "en:std": 0.18629009635512025,
      "group_latin": 0.964,
      "group_latin:std": 0.18629009635512025,
      "score:std": 0.18629009635512025,
      "score": 0.964
    },
    "score": 0.964
  },
  {
    "timestamp": "2024-11-17T07:44:52.078680",
    "model": "neuralmagic/Mixtral-8x7B-Instruct-v0.1-FP8",
    "metrics": {
      "en": 0.624,
      "en:std": 0.48438001610305936,
      "group_latin": 0.624,
      "group_latin:std": 0.48438001610305936,
      "score:std": 0.48438001610305936,
      "score": 0.624
    },
    "score": 0.624
  },
  {
    "timestamp": "2024-11-17T07:46:25.651174",
    "model": "neuralmagic/Qwen2-72B-Instruct-FP8",
    "metrics": {
      "en": 0.948,
      "en:std": 0.2220270253820467,
      "group_latin": 0.948,
      "group_latin:std": 0.2220270253820467,
      "score:std": 0.2220270253820467,
      "score": 0.948
    },
    "score": 0.948
  },
  {
    "timestamp": "2024-11-17T07:48:11.049916",
    "model": "neuralmagic/Qwen2-57B-A14B-Instruct-FP8",
    "metrics": {
      "en": 0.824,
      "en:std": 0.380820167533181,
      "group_latin": 0.824,
      "group_latin:std": 0.380820167533181,
      "score:std": 0.380820167533181,
      "score": 0.824
    },
    "score": 0.824
  },
  {
    "timestamp": "2024-11-17T07:50:01.784188",
    "model": "neuralmagic/DeepSeek-Coder-V2-Lite-Instruct-FP8",
    "metrics": {
      "en": 0.876,
      "en:std": 0.3295815528818322,
      "group_latin": 0.876,
      "group_latin:std": 0.3295815528818322,
      "score:std": 0.3295815528818322,
      "score": 0.876
    },
    "score": 0.876
  },
  {
    "timestamp": "2024-11-17T07:51:11.661882",
    "model": "hugging-quants/Meta-Llama-3.1-8B-Instruct-AWQ-INT4",
    "metrics": {
      "en": 0.848,
      "en:std": 0.35902089075707005,
      "group_latin": 0.848,
      "group_latin:std": 0.35902089075707005,
      "score:std": 0.35902089075707005,
      "score": 0.848
    },
    "score": 0.848
  },
  {
    "timestamp": "2024-11-17T07:52:26.298293",
    "model": "hugging-quants/Meta-Llama-3.1-8B-Instruct-GPTQ-INT4",
    "metrics": {
      "en": 0.852,
      "en:std": 0.35509998591945907,
      "group_latin": 0.852,
      "group_latin:std": 0.35509998591945907,
      "score:std": 0.35509998591945907,
      "score": 0.852
    },
    "score": 0.852
  }
]

The nightly gsm8k eval results on NVIDIA Cloud H100 look OK, but the CIs are not very stable, so we can ignore them for now.
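A sanity check on the numbers above: each ":std" field is consistent with the population standard deviation of a 0/1 per-sample correctness indicator at the reported accuracy p, i.e. sqrt(p * (1 - p)). A quick stdlib sketch (not part of the eval harness):

```python
import math

def bernoulli_std(p: float) -> float:
    """Population std of a 0/1 correctness indicator with mean accuracy p."""
    return math.sqrt(p * (1 - p))

# Llama-3.1-8B-Instruct reports score 0.836 with std ~0.37028;
# Mistral-7B-Instruct-v0.3 reports 0.604 with std ~0.48906.
print(bernoulli_std(0.836))
print(bernoulli_std(0.604))
```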

@zhyncs merged commit 3b87886 into main on Nov 17, 2024 (9 of 12 checks passed) and deleted the upd branch at 16:06.
@zhyncs (Member Author) commented Nov 17, 2024

By the way, I only verified on NVIDIA Cloud H100. Could you help verify on AMD and Intel? cc @HaiShaw @liangan1

@zhyncs (Member Author) commented Nov 17, 2024

After this PR was merged, all other CIs passed except https://github.com/sgl-project/sglang/actions/runs/11880304446/job/33103293929:

Writing report to /tmp/mmlu_meta-llama_Llama-3.1-8B-Instruct.html
Traceback (most recent call last):
  File "/actions-runner/_work/sglang/sglang/test/srt/test_triton_attention_backend.py", line 29, in test_latency
    assert output_throughput > 153, f"{output_throughput=}"
AssertionError: output_throughput=151.8

The measured throughput is very close to the threshold, so this can be ignored for now; issue #2059 tracks it.
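The failing check comes down to a hard throughput floor; a minimal sketch of the assertion pattern (the function name here is hypothetical, while the threshold and the f-string message format come from the traceback above):

```python
def check_output_throughput(output_throughput: float, threshold: float = 153.0) -> None:
    # Raises AssertionError when the measured output throughput (tokens/s)
    # falls below the hard-coded floor, as in the CI run above (151.8).
    assert output_throughput > threshold, f"{output_throughput=}"

check_output_throughput(155.2)  # above the floor: passes silently
```

A fixed floor like this makes the test sensitive to small run-to-run variance, which is why a 151.8 vs 153 result is treated as noise rather than a regression.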

merrymercy added a commit that referenced this pull request Nov 17, 2024